Introduction to Data Science
- The life cycle of Data Science
Statistics
- Statistical Learning
- Measures of central tendency
- Measures of dispersion
- Probability theory
- Hypothesis testing
- ANOVA
- Types of graphs and plots
R Programming
- R Environment Setup and Essentials
- Installing R on Windows, Linux, and Mac
- Exploratory data analysis
- Basic operators in R
- Data Manipulation
- Data visualisation
- Followed by a hands-on exercise
Python
- Python language basic constructs
- OOP concepts in Python
- Hands-on Exercise – important OOP concepts such as polymorphism, inheritance, and encapsulation; Python functions, return types, and parameters; lambda expressions
- NumPy for mathematical computing
- Hands-on Exercise – importing the NumPy module, creating an array with ndarray, calculating the standard deviation of an array of numbers, and calculating the correlation between two variables.
- SciPy for scientific computing
- Hands-on Exercise – importing SciPy and applying Bayes' theorem to a given dataset.
- Matplotlib for data visualization
- Hands-on Exercise – using Matplotlib to create pie, scatter, line, and histogram plots.
- Pandas for data analysis and machine learning
- Hands-on Exercise – importing data files, selecting records by group, applying a filter, viewing records, analyzing with linear regression, and creating a time series.
- Python Environment Setup and Essentials: installing Python (Anaconda) on Windows, Linux, and Mac, with a hands-on exercise
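
The NumPy hands-on exercise listed above (standard deviation of an array and correlation between two variables) could be approached along these lines. This is only a minimal sketch: the `hours` and `scores` values are made-up sample data, not part of the course material.

```python
import numpy as np

# Hypothetical sample data: hours studied and exam scores for five students.
hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
scores = np.array([52.0, 58.0, 65.0, 70.0, 78.0])

# Population standard deviation (NumPy's default, ddof=0, divides by n).
std_population = np.std(scores)

# Sample standard deviation (ddof=1 divides by n - 1).
std_sample = np.std(scores, ddof=1)

# np.corrcoef returns a 2x2 correlation matrix; the off-diagonal
# entry is the Pearson correlation between the two variables.
r = np.corrcoef(hours, scores)[0, 1]

print(f"population std: {std_population:.3f}")
print(f"sample std:     {std_sample:.3f}")
print(f"correlation:    {r:.3f}")
```

The choice between `ddof=0` and `ddof=1` depends on whether the array is treated as the whole population or as a sample drawn from it.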
Machine Learning
- Introduction to Machine Learning with R and Python
- The need for Machine Learning; introduction to Machine Learning; types of Machine Learning, such as supervised, unsupervised, and reinforcement learning; why Machine Learning with Python and R; applications of Machine Learning.
- Supervised Learning and Linear Regression
- Hands-on Exercise – implementing linear regression from scratch with R and Python, using the Python library Scikit-learn to perform simple and multiple linear regression, implementing a train-test split, and predicting values on the test set.
- Classification and Logistic Regression
- Hands-on Exercise – implementing logistic regression from scratch with R and Python, using the Python library Scikit-learn to perform simple and multiple logistic regression, and building a confusion matrix to find the accuracy, true positive rate, and false positive rate.
- Decision Tree and Random Forest
- Hands-on Exercise – implementing a decision tree from scratch in R and Python, using the Python library Scikit-learn to build a decision tree and a random forest, visualizing the tree, and changing the hyperparameters of the random forest.
- Naïve Bayes and Support Vector Machine
- Hands-on Exercise – Using Python library Scikit-learn to build a Naïve Bayes classifier and a support vector classifier.
- Unsupervised Learning
- Hands-on Exercise – using the Python library Scikit-learn to implement K-means clustering and applying PCA (principal component analysis) to a dataset.
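
The "linear regression from scratch" and "train-test split" exercises listed above could be sketched in plain Python as follows. This is an illustration under stated assumptions, not the course's reference solution: the helper names (`split_train_test`, `fit_simple_linear_regression`, `predict`) and the data are hypothetical, and the fit uses the closed-form ordinary least squares formulas for a single feature.

```python
import random


def split_train_test(xs, ys, test_ratio=0.25, seed=0):
    """Shuffle paired data and split it into training and test portions."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, int(len(xs) * test_ratio))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return (
        [xs[i] for i in train_idx],
        [ys[i] for i in train_idx],
        [xs[i] for i in test_idx],
        [ys[i] for i in test_idx],
    )


def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) from closed-form ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(xs, slope, intercept):
    """Apply the fitted line to new inputs."""
    return [slope * x + intercept for x in xs]


# Hypothetical data lying exactly on the line y = 2x + 1.
xs = [float(i) for i in range(12)]
ys = [2.0 * x + 1.0 for x in xs]

x_train, y_train, x_test, y_test = split_train_test(xs, ys)
slope, intercept = fit_simple_linear_regression(x_train, y_train)
predictions = predict(x_test, slope, intercept)

# Mean squared error on the held-out test set.
mse = sum((p - y) ** 2 for p, y in zip(predictions, y_test)) / len(y_test)
print(f"slope={slope:.3f} intercept={intercept:.3f} test MSE={mse:.6f}")
```

Because the sample data is noise-free, the fitted line recovers the generating slope and intercept; with real data the test-set error would be the quantity to watch.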
Deep Learning as Part of AI
- Natural Language Processing and Text Mining
Capstone Project
- Time Series Analysis
- Hands-on Exercise – analyzing time series data (a sequence of measurements that follows a non-random order) to recognize the nature of the phenomenon, and forecasting future values in the series.
Tableau
- Tableau for data visualisation:
- Tableau Introduction
- Working on data with Tableau
- Dashboards using Tableau (hands-on)
- Stories in Tableau (hands-on)